%  Copyright (C) 2002 Regents of the University of Michigan, portions used with permission 
%  For more information, see http://csem.engin.umich.edu/tools/swmf
\section{Hardware and Software Requirements \label{section:requirements}}

\subsection{Source Code and Compilers \label{section:compiler}}

\BATSRUS\ is written entirely in standard Fortran 90 with calls to
standard MPI libraries.  Both must be available in order to
use \BATSRUS.  Fortran 90 compilers are readily available,
although very few of them are free.  Most parallel
machines have MPI libraries installed, although most Linux
distributions do not include F90 versions of the MPI libraries.  The
code has been run with the free MPICH distribution of the MPI libraries
under Linux. \BATSRUS\ has been run successfully on the following platforms:
Cray T3E, IBM SP, SGI Origin, Altix, Darwin (Mac OS) and several
different Beowulf clusters.

If the user wishes to run \BATSRUS\ on a single node, no MPI library
needs to be installed.  Instead, the user must build the {\tt NOMPI}
library, which is included in {\tt util/NOMPI}
(see section~\ref{section:main_make}).

\subsection{Parallelism, Speed Up and Scaling \label{scaling}}

While it is designed to run on massively parallel platforms, the
code will run on any number of nodes from a single processor to as
many as the user has access to. We have run \BATSRUS\ on up
to half a million CPU cores using a hybrid MPI + OpenMP approach.
As described in the DESIGN document, blocks are distributed among the
processors and must communicate with each other through message passing.
The time spent on communication between processors, including latency,
affects the efficiency of the code.

Two ways to measure a code's performance are ``speed up'', also called
strong scaling, and ``weak scaling''.
Speed up is measured by taking a fixed size problem and running
it on more processors.  If the code runs twice as fast on twice as
many processors, then the code has perfect speed up.  If adding processors
does not reduce the run time accordingly, communication time is
likely a major part of the work load.
Weak scaling is measured by doubling the size of the problem and the number
of processors at the same time and checking whether the run takes the
same amount of time as before.
Both are important measures of code performance.
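
In symbols, the two measures can be written as follows.  The notation
is introduced here for illustration only and is not taken from the code
or the DESIGN document: $T(N,P)$ denotes the wall-clock time needed to
run a problem with $N$ cells on $P$ processors, and $P_0$ is a reference
processor count.
\[
S(P) = \frac{T(N,P_0)}{T(N,P)}, \qquad
E_{\mathrm{strong}}(P) = \frac{P_0}{P}\,S(P), \qquad
E_{\mathrm{weak}}(P) = \frac{T(N,P_0)}{T\left(\frac{P}{P_0}N,\,P\right)} .
\]
Perfect speed up corresponds to $S(P) = P/P_0$, that is
$E_{\mathrm{strong}} = 1$, and perfect weak scaling corresponds to
$E_{\mathrm{weak}} = 1$.  As a purely hypothetical example, if a fixed
size run takes 100 s on 64 processors and 60 s on 128 processors, then
$S = 100/60 \approx 1.67$ and $E_{\mathrm{strong}} \approx 0.83$.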

\subsection{Memory and Disk Requirements \label{section:memory}}

\BATSRUS\ is a block based simulation code (see the DESIGN document)
and as such, memory requirements are most easily discussed in terms 
of the number of blocks in a simulation.

To establish the computational resources needed to run the code, the
user must determine how large a simulation region
and what resolution are needed for the run.  This
establishes the total number of simulation blocks,
which in turn determines the memory requirements.
Table~\ref{tab:memory} gives examples of commonly run simulations
and their memory requirements.
Note that the memory requirement depends on the size of real numbers
selected at compilation time. This can be changed through the compile flags
in the makefiles.  For a 4 byte real, each $4\times4\times4$ cell block takes
approximately 0.2 MB of memory.  For an 8 byte real, each $4\times4\times4$ cell block takes
approximately 0.4 MB of memory.  The trade-off between the two is discussed
in section~\ref{section:compiler_flags}.

For example, simulations of the Earth's magnetosphere are typically
run with between 250,000 and 1,000,000 cells and with a $4\times4\times4$
grid structure within each block.  This means that the simulations
are run with between 4000 and
16000 blocks.  Using a 4 byte real number, each block takes approximately
0.2 MB of memory.
Therefore the simulations require between 850 and 3300 MB of
memory.  This memory is distributed across all of the
CPUs available to the run.  If these runs were conducted
on a 16 node computer, each node would need approximately 200
MB of memory (for the large run).  On a different computer, the memory
per node would scale accordingly.
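
The arithmetic behind this estimate can be written out for the large
run.  The symbols $N_{\mathrm{cells}}$, $N_{\mathrm{blocks}}$, $M$ and
$P$ (the number of nodes) are notation introduced here, and the 0.2 MB
per block figure is the approximate value quoted above for 4 byte reals,
so the results are rough estimates only.
\begin{eqnarray*}
N_{\mathrm{blocks}} &=& \frac{N_{\mathrm{cells}}}{4\times4\times4}
  = \frac{1{,}000{,}000}{64} \approx 15{,}600, \\
M &\approx& N_{\mathrm{blocks}} \times 0.2\ \mathrm{MB} \approx 3100\ \mathrm{MB}, \\
\frac{M}{P} &\approx& \frac{3100\ \mathrm{MB}}{16} \approx 200\ \mathrm{MB}.
\end{eqnarray*}
This agrees with the roughly 3300 MB quoted above to within the rounding
of the per block figure.  With 8 byte reals, at approximately 0.4 MB per
block, the same estimate doubles.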

The amount of disk space required to run the code varies dramatically
between different types of simulations, time and spatial resolution
required, and the number of variables to be saved for plotting.  
Specifically, users need to be aware of the size of restart files and
the size of output files for plotting.

Restart files, used to restart the code from the middle of a simulation,
store the entire three dimensional cell centered data used by the code.
Because the running code carries additional overhead in memory, the restart
files are smaller than the memory needed to run the code: they are roughly
one tenth (1/10) the size of the code image in memory.
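
As a rough illustration, applying the one tenth rule to the large Earth
run discussed above (about 3300 MB in memory with 4 byte reals) gives
\[
M_{\mathrm{restart}} \approx \frac{3300\ \mathrm{MB}}{10} = 330\ \mathrm{MB}
\]
for the restart data written at one time.  $M_{\mathrm{restart}}$ is
notation introduced here, and the result is an order of magnitude
estimate only.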

Output files for plotting vary greatly in size depending on the nature of
the output and the frequency of the output.  Full 3-D output files of
the entire simulation domain, not yet post processed, can be
a factor of 2 or more larger than the restart files because they contain more
variables and are not binary.
One common method for saving disk space is to write out only cut planes
instead of the full 3-D simulation region.  This saves vast
amounts of disk space, but reduces the amount of physical insight
which can be gained from the simulation.  Typical (binary) cut plane
files are approximately 1.0 MB once they are post processed.
The biggest influence on the amount of disk space required is the frequency
at which plot files are written during a run.  Clearly,
if data is output often (say every 1--30 seconds of simulation
time of a 10 hour simulation), the amount of data can grow astronomically.
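
To give a sense of scale, consider a 10 hour simulation, that is
36,000 s of simulation time, in which a single cut plane file of about
1.0 MB is written at each output time.  The assumption of exactly one
1.0 MB file per output time is made here for illustration only;
$\Delta t$ denotes the output interval and $D$ the total disk space,
both being notation introduced here.
\[
D \approx \frac{36{,}000\ \mathrm{s}}{\Delta t} \times 1.0\ \mathrm{MB}
\]
For $\Delta t = 30$ s this gives 1200 files and about 1.2 GB, while
for $\Delta t = 1$ s it gives 36,000 files and about 36 GB.  Writing
several cut planes, or full 3-D files, at each output time increases
these totals accordingly.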


\begin{table*}[t]

\caption{Examples of memory requirements for some commonly
run simulation types.}

\begin{centering}
{\small
\begin{tabular}{|l|c|c|c|c|} \hline

Simulation  & Bytes per         & Minimum number  & Maximum number & Total memory at \\
            & real number       & of cells        & of cells       & maximum cells \\ 
\hline
Saturn      &  8                & 600,000         & 1,000,000      & 6.6 GB      \\
Heliosphere &  8                & 500,000         & 2,000,000      & 13.2 GB      \\
Earth       &  8                & 250,000         & 1,000,000      & 6.6 GB      \\ 
Earth       &  4                & 250,000         & 1,000,000      & 3.3 GB      \\ 
\hline

\end{tabular}}

\end{centering}

\label{tab:memory}
\end{table*} 
